Owing to the widespread use of smartphones in everyday life and the growing computational capabilities of these devices, many complex tasks can now be deployed on them. Given the need for continuous monitoring of vital signs, especially for the elderly or those with certain diseases, the development of algorithms that estimate vital signs using smartphones has attracted researchers worldwide. Such algorithms estimate vital signs (heart rate and oxygen saturation level) by processing an input PPG signal. These methods often apply multiple pre-processing steps to the input signal before the prediction step. This can increase their computational complexity, meaning only a limited number of mobile devices can run them. Furthermore, multiple pre-processing steps also require designing several hand-crafted stages to obtain an optimal result. This research proposes a novel end-to-end deep learning solution for mobile-based vital sign estimation. The proposed method does not require any pre-processing. Owing to its fully convolutional architecture, the parameter count of the proposed model is, on average, a quarter of that of ordinary architectures that use fully-connected layers as prediction heads. As a result, the proposed model is less prone to over-fitting and has lower computational complexity. A public dataset for vital sign estimation, comprising 62 videos collected from 35 men and 27 women, is also provided. The experimental results demonstrate state-of-the-art estimation accuracy.
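The parameter saving that a convolutional prediction head can offer over a fully-connected one can be illustrated with simple arithmetic. The sketch below is only an illustration: the function names and the feature-map sizes are hypothetical and are not taken from the abstract, and the resulting ratio applies to the head alone, so it is not the whole-model "quarter" figure reported above.

```python
def fc_head_params(channels, height, width, outputs):
    """Flatten the feature map, then one fully-connected layer (weights + biases)."""
    return channels * height * width * outputs + outputs


def conv_head_params(channels, outputs):
    """1x1 convolution (weights + biases) followed by parameter-free global average pooling."""
    return channels * outputs + outputs


# Hypothetical sizes: a 64-channel 8x8 feature map and two outputs
# (for example heart rate and oxygen saturation level).
fc = fc_head_params(64, 8, 8, 2)
conv = conv_head_params(64, 2)
print(f"fully-connected head: {fc} parameters, convolutional head: {conv} parameters")
```

The fully-connected count grows with the spatial size of the feature map, while the 1x1 convolutional count depends only on the channel and output counts.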
With the advent of deep learning applications on edge devices, researchers actively try to optimize their deployment on low-power, memory-restricted devices. There are established compression methods, such as quantization, pruning, and architecture search, that leverage commodity hardware. Apart from conventional compression algorithms, one may redesign the operations of deep learning models to obtain a more efficient implementation. To this end, we propose EuclidNet, a compression method designed for hardware implementation that replaces multiplication, $xw$, with the squared Euclidean distance $(x-w)^2$. We show that EuclidNet is aligned with matrix multiplication and that it can be used as a measure of similarity in convolutional layers. Furthermore, we show that under various transformations and noise scenarios, EuclidNet matches the performance of deep learning models designed with multiplication operations.
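One way to see why a squared distance can stand in for a product is the identity $-\|x-w\|^2 = 2\langle x, w\rangle - \|x\|^2 - \|w\|^2$. The sketch below checks it numerically. The function names, the input values, and the choice to negate the distance so that larger means more similar are assumptions made for illustration, not details confirmed by the abstract.

```python
def dot_similarity(x, w):
    """Conventional multiply-accumulate: sum_i x_i * w_i."""
    return sum(xi * wi for xi, wi in zip(x, w))


def euclid_similarity(x, w):
    """Negative squared Euclidean distance: -sum_i (x_i - w_i)^2 (sign is an assumption)."""
    return -sum((xi - wi) ** 2 for xi, wi in zip(x, w))


# Hypothetical input and weight vectors.
x = [0.5, -1.0, 2.0, 0.25]
w = [1.5, 0.5, -1.0, 0.75]

lhs = euclid_similarity(x, w)
# Expand the square: -||x - w||^2 = 2<x, w> - ||x||^2 - ||w||^2
rhs = 2 * dot_similarity(x, w) - sum(v * v for v in x) - sum(v * v for v in w)
print(lhs, rhs)
```

The two quantities agree, so the distance-based score differs from twice the dot product only by the norm terms of the input and the weights.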
In this paper, we present methods for two types of metacognitive tasks in an AI system: rapidly expanding a neural classification model to accommodate a new category of object, and recognizing when a novel object type is observed instead of misclassifying the observation as a known class. Our methods take numerical data drawn from an embodied simulation environment, which describes the motion and properties of objects when interacted with, and we demonstrate that this type of representation is important for the success of novel type detection. We present a suite of experiments in rapidly accommodating the introduction of new categories and concepts and in novel type detection, and an architecture to integrate the two in an interactive system.
This paper presents a learning framework to estimate an agent capability and task requirement model for multi-agent task allocation. With a set of team configurations and the corresponding task performances as the training data, linear task constraints can be learned and embedded in many existing optimization-based task allocation frameworks. Comprehensive computational evaluations are conducted to test the scalability and prediction accuracy of the learning framework with a limited number of team configuration and performance pairs. A ROS and Gazebo-based simulation environment is developed to validate the proposed requirements learning and task allocation framework in practical multi-agent exploration and manipulation tasks. Results show that the learning process for scenarios with 40 tasks and 6 types of agents takes around 12 seconds and yields prediction errors in the range of 0.5-2%.
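When switching between different locomotion modes (e.g., stair ascent/descent, ramp ascent/descent), a powered prosthetic leg must anticipate the user's intent. Many data-driven classification techniques have shown promising results for predicting user intent, but the performance of these intent prediction models on new subjects remains unsatisfactory. In other domains (e.g., image classification), transfer learning has improved accuracy by using features previously learned from a large dataset (i.e., a pre-trained model) and then transferring the learned model to a new task. In this paper, we develop a deep convolutional neural network with intra-subject (subject-dependent) and inter-subject (subject-independent) validation based on a human locomotion dataset. We then apply transfer learning to the subject-independent model using a small portion (10%) of the data from the left-out subject. We compare the performance of these three models. Our results indicate that the transfer learning (TL) model outperforms the subject-independent (IND) model and is comparable to the subject-dependent (DEP) model (DEP error: 0.74 $\pm$ 0.002%, IND error: 11.59 $\pm$ 0.076%, TL error: 3.57 $\pm$ 0.02% with 10% of the data). Moreover, as expected, transfer learning accuracy increases as more data from the left-out subject becomes available. We also evaluate the performance of the intent prediction system with various sensor configurations that may be available in a prosthetic leg application. Our results suggest that a thigh IMU on the prosthesis is sufficient to predict locomotion intent in practice.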
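Robotic perception is currently at a crossroads between modern methods, which operate in an efficient latent space, and classical methods, which are mathematically founded and provide interpretable, trustworthy results. In this paper, we introduce a Convolutional Bayesian Kernel Inference (ConvBKI) layer, which explicitly performs Bayesian inference within a separable convolution layer to improve efficiency while maintaining reliability. We apply our layer to the task of 3D semantic mapping, where we learn semantic-geometric probability distributions for LiDAR sensor information in real time. We evaluate our network against state-of-the-art semantic mapping algorithms on the KITTI dataset and demonstrate improved latency with comparable semantic results.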
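A key question in the 3D reconstruction problem is how to train a machine or a robot to model 3D objects. Many tasks, such as navigation in real-time systems like autonomous vehicles, depend directly on this problem. These systems usually have limited computational power. Despite considerable progress in 3D reconstruction systems in recent years, applying them to real-time systems such as navigation systems in autonomous vehicles remains challenging because of the high complexity and computational demand of existing methods. This study addresses the current problem of reconstructing objects shown in a single-view image in a faster (real-time) fashion. To this end, a simple yet powerful deep neural framework is developed. The proposed framework consists of two components: a feature extractor module and a 3D generator module. We use a point cloud representation for the output of our reconstruction module. The ShapeNet dataset is used to compare the method with existing results in terms of computation time and accuracy. Simulations demonstrate the superior performance of the proposed method. Index Terms: real-time 3D reconstruction, single-view reconstruction, supervised learning, deep neural network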
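Transformer-based models are used to achieve state-of-the-art performance in various deep learning tasks. Since transformer-based models have large numbers of parameters, fine-tuning them on downstream tasks is computationally intensive and energy hungry. Automatic mixed-precision FP32/FP16 fine-tuning of such models has previously been used to lower the compute resource requirements. However, with recent advances in low-bit integer back-propagation, it is possible to further reduce the computation and memory footprint. In this work, we explore a novel integer training method that uses integer arithmetic for both forward propagation and gradient computation of linear, convolutional, layer-norm, and embedding layers in transformer-based models. Furthermore, we study the effect of various integer bit-widths to find the minimum bit-width required for integer fine-tuning of transformer-based models. We fine-tune BERT and ViT models on popular downstream tasks using integer layers. We show that 16-bit integer models match the floating-point baseline performance. Reducing the bit-width to 10, we observe an average score drop of 0.5. Finally, a further reduction of the bit-width to 8 results in an average score drop of 1.7 points.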
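The effect of bit-width on an integer multiply-accumulate can be sketched with a small fixed-point example. This is a minimal illustration under stated assumptions: the symmetric shared-scale mapping, the helper names `to_int` and `int_dot`, and the input values are all hypothetical, and the mapping is a generic stand-in rather than the number representation used in the work summarized above.

```python
def to_int(values, bits):
    """Map floats to signed integers of the given bit-width using one shared scale."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale for all-zero input
    scale = max_abs / qmax
    return [round(v / scale) for v in values], scale


def int_dot(x, w, bits):
    """Dot product accumulated entirely in integers, rescaled to a float at the end."""
    xq, sx = to_int(x, bits)
    wq, sw = to_int(w, bits)
    acc = sum(a * b for a, b in zip(xq, wq))  # integer multiply-accumulate
    return acc * sx * sw


# Hypothetical activations and weights.
x = [0.12, -0.87, 0.45, 0.33]
w = [0.91, 0.05, -0.64, 0.27]
exact = sum(a * b for a, b in zip(x, w))

# Absolute error of the integer result against the floating-point result.
errors = {bits: abs(int_dot(x, w, bits) - exact) for bits in (8, 10, 16)}
for bits, err in errors.items():
    print(f"{bits}-bit integer dot product, absolute error: {err:.2e}")
```

The rounding error per element is bounded by half the scale, so the worst-case error of the result shrinks as the bit-width grows.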
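This work proposes a new framework for a socially-aware local planner in crowded environments by building on the recently proposed Trajectory-ranked Maximum Entropy Deep Inverse Reinforcement Learning (T-MEDIRL). To address the social navigation problem, our multi-modal learning planner explicitly incorporates social interaction factors as well as social-awareness factors into the T-MEDIRL pipeline to learn a reward function from human demonstrations. Moreover, we propose using the sudden velocity change of pedestrians around the robot to address the sub-optimality in human demonstrations. Our evaluation shows that this method can successfully make a robot navigate in a crowded social environment and outperforms state-of-the-art social navigation methods in terms of success rate, navigation time, and invasion rate.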
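The ever-increasing computational complexity of deep learning models makes their training and deployment difficult on various cloud and edge platforms. Replacing floating-point arithmetic with low-bit integer arithmetic is a promising approach to reducing the energy, memory footprint, and latency of deep learning models. As such, quantization has attracted the attention of researchers in recent years. However, using integer numbers to form a fully functional integer training pipeline, including the forward pass, back-propagation, and stochastic gradient descent, has not been studied in detail. Our empirical and mathematical results reveal that integer arithmetic is sufficient to train deep learning models. Unlike recent proposals, we directly switch the number representation of the computations. Our novel training method forms a fully integer training pipeline that does not change the trajectory of the loss and accuracy compared to floating-point, nor does it need any special hyper-parameter tuning, distribution adjustment, or gradient clipping. Our experimental results show that the proposed method is effective in a wide variety of tasks, such as classification (including vision transformers), object detection, and semantic segmentation.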